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Abstract. A new approach is presented for the analysis of feedback processes in a 
nonlinear dynamical system by observing its variations. The new methodology' consists of 
statistical estimates of the sensitivities between all pairs of variables in the system based 
on a neural network modeling of the dynamical system. The model can then be used to 
estimate the instantaneous, multivariate and nonlinear sensitivities, which are shown to be 
essential for the analysis of the feedbacks processes involved in the dynamical system. The 
method is described and tested on synthetic data from the low-order Lorenz circulation 
model where the correct sensitivities can be evaluated analytically. 
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1. Introduction 

Feedback processes are present in dynamical systems that involve nonlinear relationships 
among many variables integrated over ti me. A formalism from electrical circuit theory 
[Bode, 1945] has been used to study feedback processes in climate models [e.g., Hansen , 
1984; Schlesinger, 1985] using the sensitivities of the system (first derivatives of one variable 
by another). This approach characterizes the change of the equilibrium state of the system 
after the introduction of an external forcing. 

Such an approach is valid in a theoretical model where the instantaneous sensitivities 
can be evaluated directly from the equations, however its application where the underlying 
equations are unknown (or only partially known) and all we have are observations of the 
system behavior over time is more questionable This is the situation faced in the study 
of climate. There are various ways to estimate these sensitivities in experimental studies. 
With numerical models, one approach is to inti xluce a perturbation of one variable at 
a time and then evaluate the impact on other variables. The problems associated to 
this approach are numerous. First, since the initial perturbation is limited to only one 
variable, the inter-dependence (i.e. , non-l in earity) of the sensitivities is not taken into 
account. Even if many variables are pert urb ed, it is difficult to be sure that the estimate 
of the multi-variate sensitivities is complete. Second, in most cases the sensitivities are 
estimated by finite differences between (usually equilibrium) states of the system. These 
differences can be generated by geographical location differences or by time differences, 
the result is dependent on the strategy adopted [Slingo et al, 2000]. As we will see, this 
simplistic approach can be highly misleading because it provides only the space and/or 
time averaged sensitivities that may not actually represent the system dynamics. Another 
problem with this approach is that the t ech nique estimates sensitivities that are already 
“polluted” by the feedback processes in the numerical model. For example, if the effect of 
a feedback is to damp the impact of one sensitivity, the sensitivity measured will be an 
under-estimate (the same problem happens for an amplifying feedback process). To avoid 
under or over-estimation of the sensitivities, their estimation needs to be done at a time 


3 


I 


scale sufficiently small to neglect the impact of the feedback processes. Moreover, such 
analyses can only be performed on models, not on observations of a real system, so the 
feedback estimates are only those what were introduced into the model, which does not 
provide model validation. Third, the sensitivities are often estimated by finite differences 
averaged over many geographical locations (e.g. global), which suppresses all the possible 
non-local processes. A similar effect is produced if the model outputs (or observations) are 
time averaged before analysis. 

So in many studies, the hypotheses at the base of the feedback analysis are too crude: 
linear model, constant sensitivities, mutual independence of sensitivities, mono- variable 
conception of the forcing and response. The conclusions from these studies are question? ble: 
a single feedback factor (i.e., a scalar) is supposed to explain the nonlinear multivaria'e 
processes integrated over time. If this number is positive, the process is said to have an 
amplifying effect on the initial perturbation. If the number is negative, the effect is a 
damping. We think that this over-simplification is misleading and that a time and spa:e 
analysis of feedbacks may be required to understand such complex phenomena as climate. 

Even if the sensitivities were correctly estimated, this classical approach only 
characterizes the change of the equilibrium state of the system after the introduction of 
an external forcing. The transient period, between the beginning of the forcing and the 
equilibrium is not described and the time needed to reach the new equilibrium remains 
undetermined. So the results of this kind of analysis are often insufficient even for 
comparison between numerical models and observations. 

We first refine the terminology required to perform feedback analysis with an emphasis 
on the discrete formulation of dynamical systems, which is better adapted to prediction, to 
the description of the cause, and effect relations underlying the feedback processes, and to 
the use of observations in the analysis. 

Our approach uses a statistical modeling of the dynamical system to infer, from the 
observed behavior of the system, the sensitivities of the variables of the system. These 
sensitivities are the key concept for the feedback analysis. They give the inter-dependences 
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of one variable on another that cause the feedback loops in a dynamical system. 

Because such empirical sensitivities provide the relationships among the variables they 
can be used more directly to understand the physical processes involved in the system 
dynamics (a model or observations) and to provide a more informative comparison of 
models to observation or one model to an oth er. A stability analysis of the dynamical 
processes involved can also be performed using these sensitivities. For prediction purposes, 
the temporal propagation of errors onto t he state of the system can also be analyzed 
[Smith, 1997]. 

As a test of the validity of this approach, we apply it to the output (observations) 
of the Lorenz low-order dynamical model where the equations are known and theoretical 
sensitivities can be calculated directly. We also evaluate classical feedback parameters and 
compare their usefulness as descriptors of the system dynamics to that of the instantaneous, 
multivariate and nonlinear sensitivities. 

2. Feedbacks in a dynamical system 

There are two general ways of formulating a dynamical system: the continuous and the 
discrete approaches. We prefer in this study the discrete formulation because it is simpler 
to describe the cause and effect relationships between variables. Furthermore, the discrete 
approach is more practical for prediction when no theoretical physical evolution model is 
available. We adopt the discrete formalism in the following, but will refer sometimes to 
the continuous case. The goal of this section is to show how time integration of dynamical 
relationships leads to feedback processes and to highlight the role of the sensitivities. 

a. Dynamical systems 

The object of this study is the analysis of a physical dynamical system by observing 
the time variations of the quantities defining the state of the system. A dynamical system 
is often described by a set of Ordinary Differential Equations (ODE) which come from the 
physics of the problem. For practical considerations or because these ODEs are not known, 
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the dynamical system is often discretized in the form: 

X(£ + l)=.A(P(t)) + e(f) (1) 

where X (t) is the p-dimensional vector of observable variables (defining the state of the 
system) at time t, P(t) is the d-dimensional vector of variables defining the system behavior 
(predictors) which can include X(t), e(t) is noise (instrumental or model errors), and A is 
a mapping, possibly nonlinear (vectors and matrices are indicated in bold characters). This 
kind of model is often used in atmospheric and oceanic sciences to perform, for example, 
climatological predictions. 

The determination of the good predictors P(t ) is a crucial point for the quality of the 
model. This determination uses all the physical a priori knowledge about the model. If 
P(t) = X(t) the system is said to be auto-regressive. Sometimes, the prediction of X(t 4- 1) 
requires the knowledge of X(t), X(t — 1 ),... ,X(t — q) because the system dynamics 
has an inertia that requires the knowledge of previous steps. Then, the system is said 
auto-regressive with memory q, denoted an AR(q) model. However, defining a new state 
variable X'(t ) = (X(t),X(t - 1), . . . ,X(t — q )), one can rewrite this AR(q) system as an 
AR(1) model with memory 1. 

If the dynamical system (1) is linearized near P(£ 0 ), we obtain: 

X (t 0 + At) - X(to) = G(P(t 0 )) ■ AP(t 0 ) + e(t) (2) 


where 


G(P(to)) = 


dX (t,Q At) 


v vu,y dP(t 0 ) 

is the Jacobian or sensitivity matrix of the mapping A at state P(£o)- 

The uncertainty e(t) is often neglected, so the discretized system of equation (2) is 
entirely defined by the sensitivities G(P(t )) of the dynamical system and by an initial state 
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b. General feedback analysis: sensitivities integration 

For simplicity of notation, we suppose, as is true in most of the cases, that the system 
is auto-regressive, i.e. the responses are equal to the predictors, X(t) = P{t) 

Even if the mapping A of equation (1) is linear, the global response of the 
dynamical system is not linear since the mapping is integrated in time. If in a system 
X(t + 1) = A - X(t) the matrix A is diagonal, the variables of the system are independent, 
so each variable X t evolves independently as A',- (to + k At) — (A^)* * A* (to). The absolute 
value is the damping (if |A»| < 1) or the amplifying (if \Au\ > 1) coefficient of the 
variable AV 

If matrix A is non-diagonal, i.e. some of the variables of the system are dependent 
on other variables, the system is more complex: an iiitial perturbation of one variable 
will be propagated into all the variables t hat a re directly or indirectly dependent on this 
initially perturbated variable. After k time s teps, the state of the system is given by: 
X(t 0 + kAt) = A k • X(t 0 ). So, the responses, X{t), at any time, f , are still a linear 
combination of the predictors at time, to, but the impact of an initial perturbation is more 
complex than the previous case because feedback loops have mixed the initial perturbation 

into each linked variable. dll 

If the mapping A of equation (1) is nonlinear, the Jacobians are dependent on the state 
X(t). So, even if we linearize the mapping A using its Jacobians, G{X(t)), after k time 
steps, the state of the system is given by: X{to -I- kAt) = Jnf=i G(X(to 4- l At)) ■ X (to) 
which is even more complex, where n is the product symbol. 

To define a feedback process in the dis cre te formulation of a dynamical system, we need 
at least two time steps to describe the feedback loops involved. If an initial perturbation 
AX(to) is introduced into the system at tim e, to, the response of the system at time, 
to + At, is approximated to first order by: 

AX(t 0 + At) ~G(t 0 ,t 0 + At)-AX(t 0 ), (4) 

where G(toAo + At), the gain of the system from to to to 4- At, is the Jacobian matrix 
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G{X{t)) of the mapping A between [t 0y t 0 + At]: matrix G{t 0 ,t 0 + At) has elements 

0 )^ ~ ^dxjtto at coor dinates An initial perturbation AA^(< 0 ) on variable X ; 
at time t 0 is then propagated at time t 0 + At to each variable A", that is linked to Xj via 
off-diagonal terms in G(t 0 , t 0 + At). But the resulting perturbations A Aj(t 0 + At) are the 
direct impact of the initial perturbation, so there is no feedback during [t 0 , t 0 + At]. 

At time, f = to + 2At, the impact on the system is given to first order by: 

AX(t 0 + 2At) 

~ G(t 0 + At, t 0 + 2 At) • AX (t 0 + At) (5) 

- G{t 0 + At, t 0 + 2 At) ■ G(t 0 , t 0 -I- At) • AX (t 0 ) (6) 

~ G{t 0 ,t 0 + 2At) ■ AX (t 0 ). (7) 

The previouly propagated perturbations, A Xi(t 0 + At), resulting from A Xj(t 0 ) can then 
perturb AA r j(t 0 + 2 At), completing a feedback loop. The initial perturbation A Xj(t 0 ) can 
be amplified or damped into AA’j^to + 2 At). We see in this simple example that feedback 
processes result from the time integration of the variable dependencies of the system. The 
term G(to,to -f- 2A t), representing the evolution of the system in two time steps, includes 
these feedback loops. 

c. Forcing / Response 

We introduce in this section the concept of external forcing to formalize the 
perturbations of the variables of the system we have discussed in the previous example. It 
is important to note that the feedback processes are present in a dynamical system, even 
when no external forcing is applied (forcing and feedback are often confused). 

An external forcing is a perturbation of the internal variables of the system (i.e. 
variables that define the state of the system). The external forcing has an impact on the 
internal variables, but the reverse is not true: the forcing is independent of the internal 
variables. There are many ways an external forcing could operate. The simplest model is 
the introduction of an unique perturbation at time t 0 \ E(t) = E 0 5(t 0 - t), a time-localized 
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volcanic eruption for example. In this case, the impulsive initial perturbation will be 
propagated in time through the internal variables, following the inter-dependencies of the 
variables. This is the example discussed in the previous section. 

The external forcing can also begin at time to an d remain constant in time: 

E(t) =E 0 ,te [t 0 , to + At, . . .]. In this case, the relations (5)/(7) become more complex: 


AX (t 0 + 2 At) 


— E(t 0 + 2At) + 


dX(t 0 + 2At) vu t A , dXjtp + 2 At)dX(t 0 + At) 
dX (t 0 + At) [0 J dX(t 0 + At) ax (to) 


~ Eq T G(t 0 + At, to + 2 At) • Eq + G(t 0 , to T 2 At) • E o 


E(t 0 ) 


( 8 ) 

(9) 


If the gains of the system are constant, equal to a constant G (i.e. a linear dynamical 
system), then at time t + k At: 


AX (t 0 + k At) = (l± G + G 2 + . . . + G k ) ■ E 0 


I - G k+1 

= 7- G 


• Eq 


( 10 ) 

(n) 


where I is the identity matrix. If the eigenvalues of matrix G have an absolute value lower 
than 1 (otherwise the system is unstable), the effect of the external forcing is stabilized, the 
dynamical system eventually reaches in time a stabilized state: 


AX (t 0 + k At) ~ j—~g ' E ° ’ f ° rfc + °° 

(12) 

G 

— Eq + j q ' Eq- 

(13) 


For example, a mono-variable system with G — 1/2, Eq — 1 and A (to) — 9 stabilizes at 
lim X(t 0 + k At) = 2, the forcing has changed the equilibrium state of the system. Figure 

fc-H-oc 

1 shows the values of the stabilized soluti on s of this simple system for different values of G. 
If -1 < G < 0 then the system stabilizes, but oscillates. If G is positive and close to 0, the 
system stabilizes near Eq. The closer the gain of the system G to 1 (i.e., lower but close 

to 1), the higher is the value at which it s tabi lizes, ^ lim^ X (t Q + A: At). If the absolute value 
of G is bigger than 1, the system is unstable. 
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d. One particular case: the classical analysis of a parallel feedback configuration 

The previous examples are very general since each variable of the system can be 
dependent on other variables. But in some cases, knowledge of cause and effect relationships 
provides a priori information about the ordering and the structure of dependencies. It is 
then possible, and recommended, to use this information. This kind of a priori information is 
used, for example, in the Cause and Effect Analysis technique [Andronova and Schlesinger, 
1991; Andronova and Schlesinger, 1992], 

In the classical feedback analysis [Hansen, 1984; Schlesinger, 1985], it is supposed 
that the external forcings E Xt of the system act on only one variable X e of the system, 
that all the other internal variables {A, = A*(A d )} are all dependent on one particular 
internal variable A , (i.e. , the diagnosed variable), that the impact of the external forcings 
is observed on this particular internal variable X d , and that the feedbacks act in parallel 
(Figure 2). These fssumptions are very strong cause and effect constraints: the diagnosed 
variable X d is supposed to be more important than the others internal variables {A',}, by 
hypothesis the {A';} are dependent only on X d , and they are not directly dependent on the 
external forcing. In this case, the feedback processes are assumed to act in parallel, i.e., 
they do not interact. 

Since the external forcing E Xc of the system acts on only one variable, A e , of the 
system, the general multi-dimensional expression in equation (8) becomes a scalar relation: 


AAe(f 0 + 2 At) ~ E Xc (to + 2 At) + + 

, v' SX e (tp + 2 At) <9At(to 4- At) 

^ 2 ^, 2 ^ n \ r . f +_ _l ay 


(14) 


i j dX,(t 0 + At) dXj(t 0 ) 

We measure the effect of the constant external forcing, E Xe , on X d , the diagnosed variable 
(Figure 2). We then analyze the system AAg — > AX d . So the perturbations AAi(f 0 + At) 
and AAj(to) are considered only for the diagnosed variable X d . In other words, the impact 
of the perturbations of other variables than X d are not taken into account in this classical 
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analysis (see Figure 2). Thus, the expression (14) becomes: 

AX e (t 0 + 2A t) ~ E Xe {t 0 + 2A t) + ■ A X d (t 0 + At) 

dXejtp + 2 At) dXj(to + At) . y ( f \ (i e\ 

j <9 A j (t 0 4- At) DA d (t 0 ) 

Due to the hierarchical dependencies adopted, AA(fo) - 4 AA(to 4- At) — )■ A r t (to + 2At), see 

Figure 2, some of the partial derivatives in (14) are zero: 

dXejtp + 2 At) = dXejto ± 2 At) = (16) 

dX d {tp 4- At) dA e (to + At) 

and the expression (15) simplifies to: 

s dX e (t.Q + 2 At) dXi(t 0 At) . . ,, 

AA' e(( „ + 2At) * Ex.to + m. +J^ aS5EW C 7) 

external forcing " — 1- — " 

feed! >ack terms 

?? ~ E Xe (to 4- 2At) 4- Y, H *(*<>, *o 4- 2At) AAr d (to) (18) 

i^d^e 

where the terms Hi(t 0 ,t 0 4- 2Af) are the products of first derivatives describing the cause 
and effect relations. Expression (??) can be multiplied by the gain G(t 0 + 2At,t 0 4- 3 At) of 
the system AAA (to 4" 2 At) — t AX d (to 4- 3At): 

AAA (to 4" 3At) ~ G(to 4" 2At, to 4- 3At) • AA e (fo 4- 2 At) 

~ G(to 4- 2At, to + 3At) • Ex e (to 4- 2At) 4- G(tq + 2Af , to 4- 3At) • ^ Hi(to, to 4- 2At) AAA(to) 

(19) 

If the system is in equilibrium, or if At, the time discretization, is sufficiently small, 

AAA (t 0 + 3 At) ~ AAA (t 0 ), thus: 

(1 + G(to 4- 2At, t 0 4- 2Af) • T H i (to,t 0 + 3At)]AX d {to + 3At)~ 

\ J 

G(t 0 + 2 At, t 0 4- 3At) • E Xc (to 4- 2At) (20) 

G(tp 4~ 2 At, tp 4- 3At) __T7 <t lOA+i /’oia 

^ AAA(t 0 + 3 At) ~ 1 _ ~ + 3 Ai ) Wo, k + 3At) A ' e( 0 

~ g(<o + 2At > t 0 + 3At) ^ (<o + 2At) (22) 

1 - Ei#d,i#e /i(to. <0 4- 3At) 
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where the terms f z (t 0 , t Q + 3 At) = G(t 0 + 2 At, t 0 + 3 At) ■ H t {t 0 , t 0 + 2 At) are called the 
feedback factors. The gain with feedbacks is then defined by: 


Gf(t 0 + 2At,t 0 + 3At) 


Gjtp + 2At, tp + 3At) 

1 fi(toi to “b 3 At) 


(23) 


The feedback /, factors are dependent on both the variable X e perturbed by the external 
forcing and the diagnosed variable, X d , chosen in the beginning of the analysis. These 
feedback factors are time-dependent, but this expression is traditionally [ Peixoto and 
Oort, 1991; Curry and Webster, 1998] given without time reference. This means that it is 
supposed that the system is in equilibrium or that the quantities are examined locally in 
time. 

Another way to find this expression, is to formulate this system as a “mono-variable” 
forced dynamical system AX d (t 0 ) AX d {t 0 + A t). The total gain of this system is defined 
as GH which represents the feedback loops plus the non-feedback gain, where: 


• G — with °ut feedbacks of the system AX t — ■» AX d 




and H = 


y' dX t (tp+AQ 

2^i?d,ij±e dX,(t 0 +At) dX d (t 0 ) 


, represents the feedbacks. 


The forcing of the variable X d is given by GEx c ■ In the limit of decreasing time steps, we 
could use the expression (12), to obtain: 


AX d = 


G 


1 -GH 


(24) 


This expression converges to the continuous case as At 0. In the original field 
where this formalism was developed, i.e., the analysis of electrical circuits [Bode, 1945], 
the relation (15) is instantaneous since the electricity propagates continuously. In this 
continuous case, the time reference in equation (15) can be suppressed. The same remark 
holds if the system is in equilibrium, i.e., if the previous perturbations are constant in time. 
Thus, this analysis has to be done locally in time or at equilibrium. 

The gain of the system, Gj , is very sensitive to the estimation of the factors, /,. 
Furthermore, it is very important to estimate all these factors simultaneously since the 
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effect of one particular feedback is sensitiv e t o the presence or or the absence of other 
feedbacks. 

Figure 3 shows the gain of the system, G/, as a function of the unique feedback factor, 

/, supposing that the gain of the system without feedback is G = 0.5. If / < 0, the gain 

with feedback is damped, 0 < Gj < G. If / = 0, the gain with feedback is unchanged, 

G f = G. If 0 < / < 1, the gain of the system with feedback is increased, Gj > G, and 

lim = +oo (the system becomes unstable). If / > 1, Gf is negativ', so the system oscillates, 
/-» i 

and it is unstable if Gf < — 1. We see in this figure how the effect of a feedback factor on 
the system can be sensitive and highly nonlinear. So the significance of a feedback factor is 
strongly dependent on the availability of the feedback factors of all variables: an isolated 
feedback factor can’t characterize the be havior of the whole system. 

e. One classical example 

The following example as been intensively used in the literature. We suppose that the 
global mean net radiation flux (solar minus terrestrial) at the top of atmosphere (TOA) is 
in equilibrium (XFtoa = 0). The question is: if a forcing is introduced into the system, 
how will the system react ? The global mean surface temperature Ts is often taken as 
diagnosed variable since a lot of other internal variables of the system are dependent on this 
variable. Then, we can analyze the feedback process loops acting on Ts using the above 
formalism if all feedbacks are assumed to act i n parallel. 

A forcing E Xext is introduced into an ex ternal variable, X ext (i.e. the solar insolation, 
volcanic eruptions, etc). We analyze the system: 

F TO A(t + M) = F(X eit (t),Xi(*),Ts(t)). (25) 

The terms X U are the internal variables of the system (i.e. that depend on the surface 
temperature, A', = X l {Ts )) like the albedo, the water vapor, the lapse rate, the clouds, etc. 

We suppose here that it is possible to express the external forcings, E Xtzt , in terms of 
perturbations of the net radiation flux, Etoa • The forcing introduces perturbations into 
the variables of the system; the link between these perturbations can be expressed by, see 
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equation (17): 


AW!. + 2 At) - Erodk + 2At ) ) + £ ' ~ Arifa) < 26 > 

external forcing s — ' 

feedback loops 

If the equilibrium state is reached, or if the sensitivities are instantaneous, the reference 
to time can be suppressed: 


£±Ftoa — Etoa + (£#,) A Ts (27) 

i 

where the terms H t are the products of first derivatives describing the cause and effect 
relations in equation (26). By multiplying this expression by the gain of the system without 
feedbacks, G = d ^ s , we finally obtain the following familiar expression: 

■ lTs = r~Gz,H, ETOA = r^j, Etoa (28) 

f. The classical analy; is in a series feedbacks configuration 

It is supposed again that the external forcings, E Xe , of the system acts on only one 
variable, X e , of the system. There are two diagnosed variables: X dl and X d2 , X d2 being 
dependent on X d i. Some of the internal variables {A^i = Aii(X dl )} are dependent on X du 
and some others {X i2 = Xi 2 (X d2 )} are dependent on X d2 . The impact of the external 
forcing is observed on diagnosed variable X d2 (Figure 4). This internal structure describes 
a dynamical system X e —> X d i — > X d2y with feedbacks in series. In this case, the gain of 
the subsystems X e -> X dl and X d i -> X d2 would be computed as in section 2.d. Then, the 
global gain of the system would be Gj = Gj 2 G^. 


g. Comments on classical feedback analysis 

We have seen in the two previous subsections that where particular cause and effect 
relations in the system are known, the time reference is required in the discrete case, but 
can be suppressed in two situations: 

• In an equilibrium state: the perturbations are stabilized ^ = 0 (not to be confused 
with zero forcing), so they are the same at each time step. The feedback analysis is 
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then only a characterization of the equilibrium state. There is no estimation of the 
time required to reach the equilibrium and we can’t predict transient behavior of the 
system. Furthermore, we don’t know a priori the sensitivities in the equilibrium state, 
so we are required to assume (without evidence) that the sensitivities are constant 
and that we have a good estimate of t he m. 

• When the sensitivities are instantan eou s: the relations between the perturbations of 
each variable of the system are then valid without a time reference. But in this case, 
instantaneous estimates of the sensiti vi ties are required and the feedbacks factors 
have to be computed at each time. To our knowledge, this approach has not yet 
been investigated since no technique was available to estimate these instantaneous, 
multivariate and nonlinear sensitiviti es. 

The classical approach to feedback analysis from the electrical circuits theory [Bode, 1945] 
weis first used on a theoretical energy balance model of the climate where instantaneous 
sensitivities are available. Even if the estimatio n of sensitivities was crude, the applicability 
of the technique was justified when the ca us e and effect relationships were supposed to 
be known. In more recent studies, and particularly with the analysis of observations, this 
approach to the estimation of sensitivities is h ighly questionable. In particular, the use of 
this characterization of the equilibrium state to predict the system response to an external 
forcing is difficult since the sensitivities used to produce the equilibrium state are unknown. 

Some of the limitations of actual studi es are: 

- Model used: The hierarchical model of c a use and effect relations, described by greatly 
simplified relations between sensitivities, is usually much too simple. For example, the 
fact that the forcing/gain/response system has to be mono-variable is a very strong 
simplification: such assumptions result in the suppression /neglect of some perturbations 
and some first derivatives in the system. 

- Estimation of sensitivities: The sensitivities are often estimated by finite difference 
between two (usually equilibrium) states of the system. First, this approach measures only 
the coincidence of the changes in two quantities, but it does not mean that there is a cause 
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and effect relationship between .these variables. The relationships might also be indirect 
(via the ordering of the dependencies). Second, this approach measures the changes in two 
quantities and the sensitivity is then estimated assuming that the other variables do not 
interact. This is a strong limitation since there are a lot of cross-linkages in the variables of 
the climate system. Third, the finite difference for the estimation of the sensitivities can be 
highly misleading if the sensitivities of the system are not constant in time. 

- Forcing process: The forcing model is often not expressed: localized in time, constant, 
growing in time, cyclic, etc ? The w r ays the external forcing evolves in time are also 
important for the study of the transient response. 

- Better description: Previous approaches to feedback analysis are often only a 
characterization of the equilibrium state of the system after the introduction of an external 
forcing. The transient period between the beginning of the forcing and the equilibrium 
state is not described, the time to reach the equilibrium is not estimated. This is a real 
drawback for the understanding of these phenomena. Furthermore, the gain of the system 
with feedback factors is highly dependent on the precision of the sensitivity estimates. 

In conclusion, the classical feedback analysis is limited by some very strong assumptions 
like linearity (i.e., sensitivities constant in time), equilibrium, mono-variable cause and 
effect relationships, etc, and so does not seem at all appropriate for application to the 
climate system. Moreover, the resulting expressions for the feedback factors are products 
of the instantaneous sensitivities, so it would seem .more straightforward to evaluate these 
sensitivities instead. To avoid the classical limitations, the analysis needs to employ a 
general feedback formulation to evaluate the nonlinear, multivariate and instantaneous 
sensitivities in both numerical models and observations is important. 

3. A nonlinear regression scheme for estimation of sensitivities 

To estimate the sensitivities of the dynamical model in equation (1), we use a 
multivariate nonlinear regression fit to the statistics produced by observing the behavior of 
the system over a time period long enough to provide a good sample of the different states of 
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the system. For this purpose, we introduce a neural network technique because of its ability 
to process large dimension data (which will be helpful for further experiments on numerical 
models) and its capacity to integrate a priori information about the problem [Aires, 1999]. 
Any other multivariate nonlinear regression technique, such as spline interpolation or 
ARMAX models, etc, could be used instead of the neural network technique. 

a. The neural network model 

The Multi-Layer Perceptron (MLP) network is a mapping model composed of parallel 
processors called “neurons”. These processors are organized in distinct layers: the first layer 
(number 0) represents the input P = (p, ; 0 < i < m 0 ) with m 0 the number of neurons in 

la>er 0. The last layer (number L ) represents the output mapping X = (x k ; 0 < k < mi). 
Th? intermediate layers (0 < m < L) are called the “hidden layers”. These layers are 
connected via neuronal links (Figure 5): two neurons, i and j, between two consecutive 
layers have synaptic connections associated w ith a synaptic weight Wi j. 

Each neuron, j , executes two simple operations: first, it makes a weighted sum of its 
inputs from the previous layer, Z{\ this signal is called the activity of the neuron: 

a j = 52 Wij • Zi • ( 29 ) 

ielnputs(j) 

Then, it transfers this signal to its output through a so-called “transfer function”, often a 

sigmoidal function such as a (a) = tanh(a). The output z 3 of neuron j in the hidden layer is 

then given by: z 3 = a i £ vo tj zi ) . Generally, for regression problems, the neurons 

Velnputs(j) / 

in the output (last) layer have no transfer function. For example, in a one hidden layer 
MLP (Figure 5), the k^ 1 output, x k , of the net work is defined as: 

*k(y) = 52 w jk ° ( a i) = 52 w jk ° 52 w v Pi ) ( 3 °) 

jeSi jesi \t€S 0 / 

where a is the sigmoidal function, a 3 is the activity of neuron j and S t is the layer of 
tht network (with i = 0 for the input layer). We have deliberately omitted the usual bias 
term in this formula for clarity, but include it in the actual network. 
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The key to our analysis is that any continuous function can be represented by a 
one-hidden layer MLP with this kind of sigmoid function [Hornik et al, 1989; Cybenko, 
1989]. Hence the process of training the MLP to fit the observed multi- variate, nonlinear 
relationship statistics is equivalent to deriving a multi-variate, nonlinear function that 
behaves in a similar fashion as the dynamic system in question. The key advantage of the 
neural network approach over some other methods is that the Jacobians (i.e., sensitivities) 
can be evaluated directly from the MLP (see next section). 

b. The learning algorithm 

Given a neural architecture (number c f layers, neurons and connections, type of transfer 
functions), all the information of the network is contained in the set of synaptic weights w iy 
The learning algorithm is an optimization technique that estimates the network parameters 
W ~ {icjj} by minimizing a loss function, C{W), needed to fit the desired function defined 
by observations as closely as possible. The criterion usually used to adjust W is the mean 
square error in network outputs: 

1 mL r r 

C(W) = 2 E / J M p ’ - h) 2 H(t t /P)B(P)dt„dP (31) 

with t k the k th desired output component, x k the k th neural output component, H(t k /P) 
the probability function of output t k given the input P, and H(P ) the probability density 
function of input data, P. If specific a priori information about the probability distribution 
functions is available, other quality criteria than least-squares could be used. For example, 
criteria involving higher-order statistics have been defined [Aires et al, 2000]. Practically, 
C(W) is approximated by the classical least square criterion: 

^(W) = ^Z(x t (P-,W)-t t f (32) 

zrj e=l 

The Error Back-Propagation algorithm [Rumelhart et al, 1986] is used to minimize 
C’(IT). It is a stochastic steepest descent (i.e. Newtonian minimization procedure) very well 
adapted to the MLP hierarchical architecture because the computational cost is linearly 
related to the number of parameters. 
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c. The neural Jacobians 


The important feature of neural network is that the adjoint model of the neuronal 
model is directly available [Aires et al, 1999]. The computation of this adjoint model (or 
neural Jacobians) is accurate and very fast. Since the neural network is nonlinear, these 
Jacobians are dependent on the situation x For example, the neural Jacobians in the 
previous example of equation (30) (a MLP network with one hidden layer) are: 

P- = E "Stir ( £ w <i0‘ ) w 'i- < 33 > 

dp, da j 

where I 2 is the derivative of the transfer function a. For a more co nplex MLP network 
with many hidden layers, there still exists a back-propagation algoi ithm for efficiently 
computing these neural Jacobians. 

The neural Jacobians concept is a very powerful tool because it allows for the direct 
statistical evaluation of the multivariate a nd no nlinear sensitivities ol the dynamical system 
under study. 

d. Regularization 

If a priori information about the dynamical model under study is available, it is 
possible and recommended to use this kn owle dge in the neural network analysis. This a 
priori information could be introduced in the t hree distinct components of a neural network: 

- Dataset: The quality and the representativeness of the dataset used for the training 
of the neural network is directly responsible for the quality and the generality of the 
nonlinear regression obtained. We will comment further on this during the construction of 
the dataset for our application. 

- Architecture: A lot of informati on c ould be used to define the neural network 
architecture: ordering of variables, neighborhood system between variables, dependencies 
structure, etc. A particularly promising development would be to use the ordering of 
dependencies discussed in Section 2d to d efine the neuronal links of the neural model. 
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- Training: If the sensitivity between an input and an output variable is already 
known or if it is known that this sensitivity is constant, it is possible to specify this a 
priori information as a penalty term added to the quality criterion (32) used to train the 
neural network. Other kinds of solution constraints can also be used: shape of the solution 
distribution, noise in the measurements, particular dependencies between variables. Such 
an approach has been used, for example, in [Aires et al, 1999] in the atmospheric radiative 
transfer field. 

4. Analysis of the discrete Lorenz model 

To test the definitions and the technique previously presented, we apply it to a simple 
nonlinear, multivariate, chaotic, non-stationary and forced dynamical model for which the 
sensitivities are known analytically. We choose here a discrete form of the low-order Lorenz 
model [ Lorenz , 1984]. This model is very general since it is not a mono-variable structure, 
as described in Sections 2d and 2f, and it exhibits very complex behavior. Nonetheless, w T e 
can define the time relationships directly from the equations of the model to test our ability 
to infer these relationships from the observed behavior (model output). 

We have discretized the Lorenz continuous model for two reasons: first, the discrete 
formulation makes it easier to describe the cause and effect relations of the feedback 
processes. Second, we know exactly, in this case, the analytical sensitivities of the system, 
which allows for a better quantitative evaluation of our analysis technique. 

a. Continuous Lorenz model 

The low-order model used in this study was developed by Lorenz [ Lorenz , 1984; Lorenz, 
1990] to analyze the chaos and stability assumptions about the atmospheric circulation. 
This simple model is able to represent the Hadley circulation and is used to determine the 
stability or the instability of this circulation (stationary or migratory disturbance). This 
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model is defined by three Ordinary Differential Equations (ODE): 

' = -Y 2 {t) - Z 2 (t) - a X{t) + a F\ 

< ^ = X(t)Y(t)-tX{t)Z{t)-Y(t) + F 2 ( 34 ) 

= bX(t)Y(t) + X(t)Z(t)-Z(t) 

where: 

• t is the time (in units of about 1 day), 

• X is the intensity of the symmetric globe-encircling westerly wind current and also 
the poleward temperature gradient (assumed to be in permanent equilibrium with it), 

• Y is the cosine phase of a chain of superposed large-scale eddies, which transport heat 
poleward at a ate proportional to the square of their amplitudes, 

• Z is the sine p lase of a chain of superposed large-scale eddies, which transport heat 
poleward at a ate proportional to the square of their amplitudes, 

• Fi is a zonally symmetric thermal forcing on X, 

• F 2 is a zonally asymmetric thermal forcing on Y . 

The two forcings F\ and F 2 are the values to which X and Y would be driven if the westerly 
current and the eddies were not coupled. 

The discretization of these ODEs is a ve ry delicate process, but the Runge-Kutta 
fourth-order technique can be used for that purpose. Figure 6 shows the integration 
of 34 from to = 0 to T = to + NAt using: a = 0.25, b = 4, F\ = 8, F 2 — 1 and 
At = 0.08. The initial state of the system at time t = 0 is taken as: A r (0) = 1.312465072, 
y(0) = 1.486416698 and Z{ 0) = 0.3487878144. Lorenz has shown that this system with 
these parameter values has a chaotic beha vio r. 
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c. Discretization of the dynamical system 

We are not interested in a perfect simulation of the Lorenz model; rather, we are 
interested in a representation of this system in a form like 


( X(t+ 1) ^ 


( X(t) y 

Y(t + 1) 

= A 

Y(t) 

V Z(t + 1) J 


K Z(t) J 


as a test of our analysis technique. 

By discretizing (34), we obtain: 

X(t+1) = At [-Y[t) 2 - Z{tf + a F x ] +(1 — a At) X(t) 

< Y{t + 1) = At \-b X(t) Z{t) + F 2 ] +(1 - At + At X(t)) Y(t) (36) 

Z{t +1) = A tb X{t) Y(t) +(1 -(- A t X{t) - At) Z{t) 

where At is the discrete time step. 

The size of the time step At needs to be sufficiently small so that the linearization of 
the system during a time interval is accurate in order that means that the hypothesis that 
the Jacobians of the system are constant during the time interval is true. 

The time discretization is also directly related to the regularity of the Jacobians of 
the system: high complexity requires small time steps to ensure a good description of the 
evolution of the Jacobians. We take At = 0.08. 

The differences between the fourth-order Runge-Kutta integration of (34) and the 
discrete Lorenz model 36 are shown in the Figure 7. We see that these differences are small 
at the beginning of the period, but that the amplification of these little difference becomes 
important over time since the system is chaotic. The behavior of the continuous and the 
discrete systems seems to be the same, in particular the same amplitudes for the minima 
and maxima are observed. All that is required to test our analysis is that the discrete model 
exhibit complex, nonlinear behavior: we take the discrete model to be the truth and test 
whether we can infer the correct the relationships with our neural network technique. 
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d. Sensitivities of the dynamical system 


The Jacobian matrix of the discrete system is: 


G 


1 X{t+ 1) N 
r(t + i) 
\Z(t+l) ) 



ax{£+ 1) 

dX{t+ 1) 

ax(£-fi) 

\ 


dx(t) 

0K(O 

dz(t) 




ar(t+i) 

av(t+i) 



8X (t) 

8Y{t) 

dZ{t) 



dZ(t+ 1) 

dZ{t+ 1) 

ez(t+ 1 ) 

) 

aA'(t) 

ay(t) 

dZ(t) 


( 


( 37 ) 


1 - a At -2 At Y(t) -2 At Z(t ) 

-Atb Z{t) + AtY{t) \- At + At X{t) -b At X(t) (38) 

\ At b Y{t) + At Z{t) At b X{t) l + AtX(t)-Atj 

These Jacobians are dependent on the state of the system, so they are also dependent on 

time and the hypothesis of constant Jacobi an s, as in classical feedback an '.lysis, can not be 
used to understand this system. 


e. Theoretical feedback analysis 


The two external forcing, aF x on X , and F 2 on Y, are continuous and constant in (34). 
In the discrete formalization, this is simulated by a constant impulsive forcing: 


AX(t 0 + kAt ) = At a F } 

< for k = 0, . . . , N 

AY (to *b k At ) = At F 2 

If the beginning state of the simulation is chosen as: 

' X(t 0 ) = 0 
< Y(t 0 ) =0 
k Z(t 0 ) = 0 

then, the state of the system at the next time step is given by: 

X (to d" At') — At a F\ 

* Y (to ■+■ At) = At F 2 

Z(to *b At) = 0 


(39) 


(40) 


(41) 
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and, for the next time step: 


A (to + 2 At) — 2 A t cl Fi — A t 3 F 2 2 — A t 2 cF F\ 

* V (to + 2 At) = 2 At F 2 — At 2 F 2 "h At 3 cl F\ F 2 (42) 

Z(t 0 + 2 At) = At 3 a b Fi F 2 

and so on. We analyze the impacts of the external forcings, a F\ and F 2 , in the diagnosed 
variable, chosen here to be A. 

The perturbation at time t 0 4- At, As\(t 0 4- At) = At a F u is straightforward. At time 
t 0 4- 2 At, without feedbacks, the forcing would simply be added: 

A A (t 0 + 2 At) = 2 At a Fj (43) 


With feedbacks, the true perturbation is given by: 


AX (t 0 4- 2 At) — E\ (to 4- 2 At) 4* 


dX (to 4" 2 At) 
dX (to 4" At) 


Ex (to 4- At) 


+ 


dX (to 4" 2 At) 
dy (to + At) 


Ey(to 4" At) 


= 2A taF 1 - A t 3 F 2 2 - Af 2 a 2 Fj 


(44) 


Comparing expression (43) and expression (44), we note the presence of two correction 
factors giving the contribution of the feedback processes: so far there are feedbacks caused 
only by the integration of the variables over time. This expression for the perturbation is in 
agreement with first equation in (42). 

For the description of the indirect feedbacks, three time steps are required. At time 
t 0 4- 3 At, the integration of the external forcings is even more complex: 


AX (to 4 - 3 At) = (to + 3 aq 4 - 

external forcing 


0X(t o + 3At)« , ^ , dX(t Q +3 At) „ , 

&x (*o + 2 At) + — — ■ Ey (*o + 2 At) 


ax(t 0 + 2 At) 


ar(t 0 + 2 At) 


direct feedbacks 


, fdX(i 0 + 3 At) dX (to -f 2 At) dX(t 0 + 3 At) dY(t 0 +2 At) dX(t 0 + 3 At) dZ(t Q +2 At) l B + 

|aX(t 0 + 2 At) OX(t Q + At) + $y(t 0 +2At) &X(to + At) a2(t 0 +2 At) 0*(<O + At) J * ° 

V v ' 

indirect feedbacks 

, rax(t 0 +3 At) dX(t 0 +2 At) dX(t 0 + 3 At) 9Y(t 0 + 2 At) dX(t 0 4- 3 At) dZjip + 2 At) 1 + 

^ [ax(t 0 + 2 At) fly(t 0 + At) + «?y(t 0 + 2At) ay(t 0 + At) + s2(t 0 + 2At) ay(t 0 + At)J y 0 

^ v* ' 1 1 ^ 

indirect feedbacks 


(45) 
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We note in this expression some terms that do not appear in the classical analysis formalism. 
For example, the direct feedbacks terms (due to time integration of the variables) are 
suppressed in the classical analysis. Furthermore, we see that in this expression both 
forcings (on variable A' and on variable Y) are taken into account, which is not possible in 
the classical approache. 

Integrating the system for one more time step would be highly complex, this is the 
reason why prediction of this kind of dynamical system is a difficult problem. To perform 
prediction, the model needs to represent the sensitivities with a high degree of precision. 
Otherwise, an error at ore time step is rapidly amplified in the next time steps. 

The classical formalism for the feedbac k analysis is not well adapted to the analysis of 
the Lorenz system since here is no preferred variable on which the other two variables of 
the model depends. So we see in this simple example how limited the assumptions used in 
the classical feedback analysis formalism ar e a nd how such an analysis could be misleading. 
Again, it is clear that evaluation of the sen siti vities is more straightforward than evaluation 
of feedback factors, which are products of sensitivities. However, for illustrative purpose 
we will also use the classical formalism to calculate feedback factors because they are more 
familiar. If we choose the variable Y as .he variable affected by the external forcing and A 
as the diagnosed variable, the gain of the syste m E y -> AX is given by, see equation (24): 


AA = 


H 


1-GH 


Ey — Gf Ey 


G 


i-x ifi 


XY 


Ey 


(46) 

(47) 


where: 


• q — §2L i s the gain without feedbacks of the system Ey AX, 

j TT &Y i 

• and H = w + XiexlTx- 

The three feedbacks factors for this mono-variable system are defined as: 

tYX dX dY 
* x ~dYdX 
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(49) 


YX _ 9X dY OY 
Iy dY dY dX 
fYX _dXdYdZ 

“ dY dZ dX ( 5 °) 

Note that the sensitivities used in this relation still are dependent on time and have to be 

estimated precisely, so that the feedback factors are also time dependent. As we will show, 

this fundamental property of complex, nonlinear dynamical systems reduces the value of 

the classical (linear) feedback analysis for understanding the system behavior. Note again 

that the above quantities are not the true feedback factors for the Lorenz model since they 

are defined using invalid assumptions. 

5. Experimental results 

a. Construction of the dataset 

The quality of the dataset used to evaluate the sensitivities is a crucial point. For 
example, using data from a system in equilibrium or from a system during a transient 
change will not give the same results in the analysis. Ideally, a good dataset would be one 
including all ranges of variability for all combinations of the variables of the system. The 
more situations that are included in the dataset, the larger wall be the range of validity of 
the sensitivity estimates. This situation parallels that in climate analysis where the range 
of validity is limited by the range of climate states actually observed. 

The discrete dynamical version of the Lorenz model stabilizes more rapidly onto its 
attractor than the continuous version. So to create a dataset closer to the behavior of the 
continuous system, we choose 200 noisy states of the continuous system as initial states 
for 200 trajectories of 1000 times steps of the discrete system in equation (36). The final 
dataset is then composed of N = 200,000 couples {(/*,(>*) ; k = where 

I = (X (to + k A t) , Y (to -i~ k At), Z(t 0 k At)) is an JVx3 matrix of the inputs of the 
system and O k = (A (to + {k + 1) At),y(to + {k + 1) At),Z(to + (k + 1) At)) is an N x 3 
matrix of the outputs. Each couple is linked by: O k = A(I k ). 

The parameters for the Lorenz model are the same as previously: a = 0.25, 6 = 4, 
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F\ = 8, F 2 = 1 and At = 0.08, but we have introduced a Gaussian noise Af{0, 0.001) at each 
time step and in each variable. Figure 8 shows the resulting noisy trajectories included in 
the dataset. 


b. Linear and nonlinear regressions™ 

If a priori information is available about what are the good predictors, the dynamical 
system can be described as a linear model. In the Lorenz case, the good predictors, P{t), 
of the general model (1) can be determined from the theoretical model (36): 

P(t) = (X(t), Y(t), Z(t),r 2 (t), Z 2 (t), X(t)Y(t),X(t)Z(t\ Fi , F 2 ). (51) 


In this configuration, the dynamical system of equation (36) becomes: 


1 X(t+ 1) ' 


Y(t+ 1) 

V Z(< + 1) j 

■where the constant matrix A is given by: 


= A ■ {X(t),Y(t),Z(t),Y 2 (t),Z 2 (t),X(t)Y(t),X(t)Z{t),F u F 2 ) (52) 


( 


A = 


V 


-a At 0 0 

0 1A t 0 

0 0 1 - At 


-At -At 0 
0 0 -At 

0 0 b At 


0 a At 0 ^ 
-b At 0 At 
At 0 0 j 


(53) 


A linear regression in this case would give a good estimate of the matrix A. This is a very 
general idea: all nonlinear dynamical systems could be simplified, and even linearized, if all 
of the good predictors are known. 

In practise, this a priori information is not available, so choosing the good variables 
to predict system behavior is a key issue that has no general answer. Usually, then, the 
predictors are chosen as the state variables; model (1) becomes: 


1 X(f+1) ' 


f X(t) > 

Y(t+ 1) 

= A 

Y(t) 

^ Z(t + 1) j 


k m ) 


(54) 
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Now, a linear regression analysis approximates the nonlinear function A by a linear 
model: A is replaced in (54) by a 3 x 3 matrix A. This matrix is estimated by minimizing 
the least squares criterion and is given by: 


A = 


T 1 

3 • o 


(55) 


The use of this linear regression is already an improvement compared to classical 


approaches because it allows the simultaneous estimation of multivariate sensitivities. 

For a nonlinear regression, we use an MLT network with one hidden layer. The 
architecture has three units in the input layer coding I = (X(t),Y(t), Z(t)), 30 units in the 
hidden layer (this number was chosen by trial in the training phase) and three units in the 
output layer coding the prediction, O = ( X(t + 1), Y(t + l),Z(t + 1)). 

For the training of the neural network (i.e. estimation of the parameters for the 
nonlinear regression), we have used 150,000 points randomly chosen from the data set 
previously constructed and for the test data (i.e. to measure the ability of the model to 
generalize to unknown data) we have taken the remaining 50,000 points. 

In Figure 9, the theoretical (points) function A and its two estimates (by linear and 
neural network regressions) are illustrated. For display purpose, each plot represents one of 
the variables at time t + 1, as a function of a variable at time t, supposing that the two 
other variables are equal to their mean values. It is clear that the neural network regression 
is very precise (differences with the theoretical function are undetectable) and useful for the 
nonlinear behavior (X(t + 1) as a function of T(t), for example), where the linear regression 
is very poor. This figure shows how important the nonlinear aspect is: the multivariate 
approach of the linear regression is not sufficient. These conclusions are confirmed in Figure 
10 where the RMS error for the estimation of the functions is given. Here the errors of the 
linear regression are nearly as large as the variability of the quantities. 

A dilemma that we will be faced with in applying this technique to a real case, 
numerical model or observations of the climate, is that we do not know the true answer 
as we do here for the Lorentz model. Hence, we must develop practical ways to assess the 
fidelity of the analysis results. One possibility is to conduct “prediction” experiments where 
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we pick many specific and different episodes in the observed record (preferably time periods 
not included in the original analysis), initialize the neural network at the beginning state, 
and calculate forward for a short time interv al. The goal of such experiments is diagnostic, 
to test quantitatively whether the derived sensitivities used in the neural network can 
reproduce the observed system dynamics in cases not included in the original analysis. It 
is not our goal to propose that such a neu r al n etwork be used for climate forecasts (i.e, be 
used as a statistical forecast model) in plac e o f a physical model of the climate (see [ Yuval, 
1999] for a previous study on this subject). Rather, we are interested in whether the derived 
sensitivities can be used to understand the physical processes; at least the sensitivities of a 
model can be compared with observations. We have tested this idea by making prediction 
runs with our neural network representatio .'Lof the Lorentz model: the calculation proceeds 
by calculating the state of the system at time step, t + 1, from the state and sensiti\ities 
of the system at time, t\ the sensitivities are then calculated at time, t + 1, and used in 
the next cycle. Figure 11 shows the evolution of the rms error of the predictions based 
on the linear and our nonlinear statistical mo dels against the actual model started at the 
same state (each time step = 0.08 units, about 2 hr in the scaling of the equations). As 
expected, the nonlinear regression by the neural network does much better than the linear 
regression, but the fact that the Lorentz system is chaotic (with the particular parameter 
values used) results in a relatively rapid increase of prediction error, even with an accurate 
approximation of the system dynamics. Figure 12 illustrates the time records from the 
prediction model and the actual model. 

c. Analysis of sensitivities 

We illustrate the retrieval of the vari a ble sensitivities in the form of histograms of 
their distribution of values as a functions of X (t). Similar figures (not shown) are obtained 
as function of Y{t) or Z(t). The standard-deviation of the theoretical sensitivities of the 
system are shown in Figure 13, indicating that the all sensitivities of the system, except for 

Up ’ are not constant - 
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The classical approach for the estimation of sensitivities takes the finite difference in 
two variables between two (usually equilibrium) states of the system or two extreme events. 
For example, for the estimation of ^ 7 , two sets of extreme events of the variable Y could be 
selected in the observations and the averages of the state differences < AX > and < AY > 


estimated. Then, the following approximation would be used: 

dY < AX > 

ax W ~ < A Y>' 


(56) 


We see ho\v this approach can go wrong because it is so dependent on the selection of 
data: at best, it gives a crude estimate of the mean sensitivity for the selected dataset of 
extremes. The results of this approach for the Lorenz model would be very poor. 

The particular sensitivity ^ is the only one that is constant, i.e. does not depend 
on the state of the system: in equation (38), = 1 - a At (the values in Figure 13 are 

not perfectly equal to zero due to numerical imprecision). The linear regression, tor this 
particular sensitivity, is then a good estimation technique. So the results are good in this 
particular case, but for the eight other sensitivities, the results of the linear regression are 
insufficient. In a real world case, we would not know which results are correct, if any. 

The neural network-based estimates of the sensitivities (Figure 14) are a considerable 
improvement in comparison to the linear regression-based ones (except for the constant 
sensitivity at extreme values of X '(f), but the differences are still negligible). Note 

that the magnitudes of the sensitivities are very different, yet our technique seems to be 
able to retrieve these different orders of sensitivity in the system. Furthermore, these results 
are good if we compare the rms errors with the natural variability described in Figure 
13. These results are summarized in Table 1 . Since linear regression-based sensitivities 
are constant by assumption, the rms errors of this representation are essential equal to 
the standard deviations of the sensitivities. The improvement of the neural network-based 
sensitivities is considerable with respect to the linear regression: standard-deviations errors 
is always (except for the constant sensitivity) smaller than the natural standard-deviation of 
the theoretical sensitivities by one and somtimes two orders of magnitude. Given the large 
range of sensitivity magnitude, it is notable that the RMS errors of the neural network are 
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uniformly distributed over the nine sensitivities, even if the variability of the sensitivities is 
quite different. Table 1 summarizes the improvement gained by use of the neural network 
Jacobians to estimate the instantaneous, multivariate and nonlinear sensitivities of the 
discrete Lorenz dynamical system. 

Figure 16 shows an example of the evolution in time of the theoretical and neural 
network estimates of sensitivities. The diffe r ences between the theoretical sensitivities 
and the neural network-based estimates are undetectable in this figure. This figure also 
highlights the more complex role of the feedbacks processes: when the state of the system 
reaches some extreme value, the sensitivities change, even in their sign, taking to the system 
back towards a middle range of values and f i nally to stabilize the system on its attractor. 

For example, using the theoretical sensitivities in equation (38), we can analyze the 
relation between the variables X and Y. If Y is large and positive, then the sensitivity 
= -2 At Y(t) becomes large and negative. So, if Y continues to increase, the 
variable X will decrease even more rapidly. But the auto-sensitivity (the most 

important sensitivity for the variable Y) is equal to 1 — At + At X(t ), which will be lower 
than 1 (damping effect) when X is lower t ha n 1. One consequence of this behavior is that 
particular sensitivities, even w r hen they ar e sm all on average, can still have a strong impact 
on the behavior of the system. A linear regression analysis assuming that the sensitivities 
are constant in time, may provide some estimate of mean sensitivities from a dataset. For 
example, the sensitivity dX d z(i ' p is, on average, nearly zero. A linear analysis, in this case, 
might suggest neglecting this relationship in understanding the system. Figure 15 shows 
how wrong this approximation would be: t h is figure represents the discrete Lorenz model 
defined in equation (36) with and without this particular sensitivity. The two trajectories 
have a quite distinct behavior: the simula ti on without the sensitivity oscillates more 
strongly and with a different time scale. The behavior of the complete system is produced 
by oscillations of the particular sensitivity, depending on the state of the system, between a 
positive and a negative value, theby stabilizing the system dynamics. 

These results are special features of the general tendency of the sensitivities to exhibit 
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similar shapes in their time records (Figure 16), which means that, they are closely linked 
with each other. This type of nonlinear behavior prevents a linear, even multi-variate, 
regression analysis from extracting even approximate information about the system 
dynamics. Comprehension of the system seems to require a more accurate representation of 
the time evolution of the multi-variate sensitivities. 

d. Feedback analysis 

We have seen that the classical approach for the feedback analysis, which makes strong 
(and incorrect) hypotheses about the dynamical system, is not w r ell adapted to the Lorenz 
model. 

However, the feedback factors can still be computed for the theoretical function, the 
linear regression model and the neural network model according to equations (48)/ (50) . We 
suppose here that these expressions are applicable to show r that these feedback factors evolve 
in time (Figure 17), in violation of one of the assumptions used to describe the expressions. 
The feedback factors (48) / (50) are not simple and do not improve our understanding of 
the system since their physical interpretation is confused since these feedback factors are 
products of the sensitivities. The sensitivities, themselves, seem to be the more fundamental 
quantities. Furthermore, as we showed in Section 2, without all the assumptions at the base 
of this formalism (linearity, constant sensitivities, hierarchical cause and effect relationships, 
constant forcing, equilibrium state, etc), the whole formulation in terms of feedback factors 
falls apart. 

6. Concluding remarks 

What we have learned with this study of the Lorenz model is that the feedback 
processes are dependent on some important particular properties of the dynamical system 
under study. First, the feedback processes appear in a dynamical system when multivariate 
sensitivities are integrated over time. Second, if the system is nonlinear (i.e. the dynamical 
operator in equation (1) is nonlinear), the sensitivities are not constant with time, which 
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means that the feedback processes evolve in time. Third, each feedback has a strong 
impact on the character and behavior of the dynamical system, even those that may have 
a small time-averaged magnitude can have a stabilizing effect that changes drastically the 
characteristics of the system. Without such feedbacks the dynamical system would have a 
tendency to destabilize when an external for ci ng is introduced. The feedback processes have 
a stabilization effect, so the system does n ot d iverge too much from its initial equilibrium. 
But this new equilibrium state could be d iffer ent, with for example a higher frequency of 
extreme events. This is a theory that has been discussed recently by [Palmer, 1999]. Such 
an effect might explain the increase of th e frequency of ENSO events with an increase of 

co 2 . 

We have shown that the classical technique to analyze climatological feedback processes, 
from the electrical circuit theory, is, by hypothes is, very limited in its validity when applied 
to highly nonlinear multi- variate systems. Its applicability to the climate problem is even 
more questionable. Furthermore, the results of his kind of classical analysis are no more 
than a “schematic” measure of feedback processes at equilibrium of the system, which may 
be very misleading. 

In comparison, the multivariate, instantaneous and nonlinear sensitivity concept, is 
more generally applicable without these constraints, and appears to be a good way of 
understanding the behavior of a system with coupled feedback processes. This general 
technique allows the quantification of these processes both spatially and temporally. This 
dynamical information seems to be mor e usefu l than classical feedback factor (only one 
number per variable). 

Our technique for statistically inferring the complex network of sensitivities is 
particularly efficient and its generality and simplicity allow for the use of important a priori 
information in real-world studies. The dataset used in our analysis technique needs to satisfy 
some statistical requirements. First, the space and time sampling needs to be adequate 
to the description of the space and time variability of the sensitivities that originate the 
process feedbacks, so that the assumption that the sensitivities are constant over one 
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time step is an account approximation. Using too coarse time sampling is equivalent to 
using time-averaged data, which mixes many physical processes and ruins the sensitivity 
estimates. Using space- averaged data is also dangerous; for example, a mean sensitivity 
equal to zero could be generated by two opposite regimes with non-zero sensitivity. In 
other words, even if we are studying the longer-term behavior of the system, we must 
resolve the dynamics appropriately or the nonlinear integration will be incorrect. A study 
on the space/time variability of the sensitivities is then a prerequisite for the definition of a 
dataset sampling for feedback analysis. Second, the dataset has to have a good space and 
time coverage in order to represent as many climatological situations as possible. In other 
words, the dataset should contain all possible combinations of the state variables. Th e 
more situations in the dataset, the better will be the “laws” inferred by the analysis. These 
two points are a major argument to use actual, very large, long-term, datasets instead of 
generating new ones, limited in time. Moreover, these comments mean that the dynamics 
of the system cannot be correctly deduced from datasets where individual quantities have 
been separatelly averaged on space and time. 

Our technique has the advantage of being applicable to numerical model data as well 
as observations, which means that the important work of inter-comparison of models and 
of validation of models could be carried out with a meaningful measure: the sensitivities of 
the variables of the system. This diagnostic measure is particularly interesting because it 
concerns very intuitive and physical quantities. Comparisons of the sensitivity relationships 
could also be made with field experiment data to understand how physical processes 
product these sensitivities. Thus, our analysis approach provides a framework for a whole 
new attack on these problems. 

The statistical model estimating the sensitivities can also be used to study the response 
of the system new equilibrium state, including the time to reach equilibrium after a small 
perturbation. This simplified model could also be used to analyze the propagation of 
uncertainties when predictions are performed. In other words, the neural network statistical 
model provides a better approximation of “small perturbation” behavior than attempts 
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to linearize the system by dropping relationships. The next step of these ideas is to use 
this new technique for more complicated climate systems involving real observations or 
numerical model outputs. 
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Figure 1. The stabilized values lim X(t 0 + k At) of a mono- variable linear system for 

k->+o6 

different values G of the gain of the s ys tem and with external forcing E 0 = 1 
Figure 2. Feedback loops system in p a rallel 

Figure 3. Analysis of the gain Gf oTthe system as a function of the unique feedback 

factor /, with G — 0.5 

Figure 4. Feedback loops system in series 

Figure 5. Architecture of a MLP neural network with L layers, with inputs P and 
outputs X 

Figure 6. Lorenz model, with parameters a = 0.25, 6 = 4, F\ = 8, F 2 = 1 and A t — 0.08, 
simulated by fourth-order Runge-Kutta 

Figure 7. Fourth order Runge-Kutta (continuous lines) and discretized Lorenz model 
(dashed lines) with parameters a = 0.25, b = 4, F\ = 8, F 2 = 1 and At = 0.08 
Figure 8. Noisy trajectories of the dataset from the discrete Lorenz dynamical system 
Figure 9. Representation of the theoretical Lorenz Dynamical operator (continuous 
line), its neural network estimate (dotted lines), and its linear regression estimate (dashed 
lines) 

Figure 10. RMS error for the estimation of the dynamical Lorenz operator: neural 
network regression (continuous lines), and linear regression (dashed lines) 

Figure 11. Prediction RMS error for the Neural Network regression (continuous line) 
and the Linear regression (dashed line) as a function of the forecast range (in time steps 
At = 0.08) 

Figure 12. An example of prediction with a forecast range of 24 time steps 

Figure 13. Standard-Deviation of the sensitivities of the discrete Lorenz dynamical 

system 

Figure 14. Root Mean Square error for the sensitivities estimates: neural network-based 
estimates (continuous lines) and linear regression-based estimates (dashed lines) 



Figure 15. Discrete Lorenz model (continuous line) and discrete Lorenz model minus 
the sensitivity (dashed line) 

Figure 16. Jacobians evolution through time: theoretical Jacobians (continuous line), 
linear regression based estimates (dotted lines), and neural network based estimates 
(dashed lines) 

Figure 17. Feedback factors evolution through time 



Table 1 . Statistics on true and retrieved sensitivities 


Sensitivity Statistics 

Theoretical Linear Neural Network 

Mean 
Std-Dev 
RMS Error 

0.980 0.973 0.981 

0.000 0.000 0.003 

0.007 0.003 

Mean 

aA '( <+1 ) Std-Dev 

dY(t) OLU 

RMS Error 

-0.077 -0.025 -0.076 

0.133 0.000 0.132 

0.144 0.004 

Mean 

W Std - Dev 

RMS Error 

-0.057 -0.064 -0.057 

0.146 0.000 0.145 

0.147 0.004 

Mean 

^ Std-Dev 

RMS Error 

-0.077 0.014 -0.077 

0.297 0.000 0.297 

0.310 0.003 

Mean 

^ Std-Dev 

RMS Error 

0.955 0.979 0.956 

0.048 0.000 0.043 

0.054 0.003 

Mean 

snap std-Dev 

RMS Error 

-0.141 -0.133 -0.141 

0.192 0.000 0.193 

0.193 0.003 

Mean 

S§±il Std-Dev 

RMS Error 

0.184 0.259 0.184 

0.281 0.000 0.281 

0.291 0.004 

Mean 

Sflil Std-Dev 

RMS Error 

0.141 0.226 0.142 

0.192 0.000 0.192 

0.210 0.003 

Mean 

Sfdil Std-Dev 

RMS Error 

0.955 0.962 0.955 

0.048 0.000 0.048 

0.049 0.003 
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Figure 1 . The stabilized values lim X(t 0 4- k At) of a mono- variable linear system for 

fc-4-f OC 

different values G of the gain of the system and with external forcing E 0 = 1 
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Figure 2. Feedback loops system in parallel 
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Figure 3. Analysis of the gain Gj of 
factor /, with G = 0.5 
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Figure 4. Feedback loops system in series 








OOOOOOOOOG 


I 


LAYER 0 



LAYER ] 


©-* 

O - 
O- 
O- 

* 8 : 
O- 
o- 
o - 
o- 

LAYER L 


OUTPUTS 

X 


Figure 5. Architecture of a MLP neural network with L layers, with inputs P and 
outputs X 









Figure 7. Fourth order Runge-Kutta (continuous lines) and discretized Lorenz model 
(dashed lines) with parameters a = 0.25, b = 4, F\ = 8, = 1 and At = 0.08 














Figure 9. 


Representation of the theoretical Lorenz 


Dynamical operator (continuous 


line), its neural network estimate (dotted lines), and its linear regression estimate (dashed 
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Figure 10. RMS error for the estimation of the dynamical Lorenz operator: neural 
network regression (continuous lines), and linear regression (dashed lines) 
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Figure 11. Prediction RMS error for the Neural Network regression (continuous line) 
and the Linear regression (dashed line) as a function of the forecast range (in time steps 
At = 0.08) 
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Figure 12. An example of prediction witfija Forecast range of 24 time steps 
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Figure 14. Root Mean Square error for the sensitivities estimates: neural network-based 
estimates (continuous lines) and linear regression-based estimates (dashed lines) 
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Figure 15. Discrete Lorenz model (continuous line) and discrete Lorenz model minus 
the sensitivity (dashed line) 









Figure 16 . Jacobians evolution through time: theoretical Jacobians (continuous line), 
linear regression based estimates (dotted lines), and neural network based estimates 
(dashed lines) 












